ABSTRACT 

Implementation of a computer-based model for 
morphological analysis and synthesis of language, entitled P-KIMMO, 
is discussed. The model was implemented in Quintus Prolog on a Sun 
Workstation and exported to a Macintosh computer. This model has two 
levels of morphophonological representation, lexical and surface 
levels, associated by morphophonological rules that specify 
legitimate pairs of characters. The description offered here focuses 
on aspects of implementation only and not underlying theory. 
Components of the program are described, including structure of the 
lexicon, use of finite state automata to encode two-level rules, and 
the recognizer/generator algorithm. This version of the program is 
then compared and contrasted with a previously implemented version. 
Finally, procedures for use of the program on the UNIX and Macintosh 
computers are outlined, with some screen illustrations. A brief 
bibliography is included, and a source listing of the P-KIMMO system 
is appended. (MSE) 
1. Introduction 



This report describes a Prolog implementation of the two-level model 
for morphological analysis and synthesis developed by Kimmo Koskenniemi 
(1983). The two-level model was originally implemented in Pascal by 
Koskenniemi himself. Karttunen and his students developed a LISP 
implementation, which was named "KIMMO" after its originator. Their work was 
published in "Texas Linguistic Forum" (1983). A Prolog implementation of the 
formalism was done by Boisen (1988). A comparison between the two Prolog 
implementations is given in section 4. Quite recently, a C implementation 
by Antworth (1990) has become commercially available. Some test results 
comparing it with P-KIMMO are also given in section 4. 

The two-level system described in this report, which I will call 
"P-KIMMO", has been implemented in Quintus Prolog on a Sun Workstation and 
exported to the Macintosh computer. An additional routine that equips 
P-KIMMO with a menu-driven user interface was written for the Macintosh 
version (see section 5.2). The source code is listed in the appendix of this 
report. It can also be obtained as an ASCII file on a Macintosh-formatted 3 1/2" 
disk. Requests should be sent to: 

Kang-Hyuk Lee 

Department of Linguistics 

University of Illinois 

4088 Foreign Languages Building 

707 S. Mathews 

Urbana, IL 61801 

E-mail: klee@lees.cogsci.uiuc.edu 

The current version of P-KIMMO has been integrated into the UNICORN 
natural language processing system (Gerdeman and Hinrichs 1988) as the 
morphological component. Research is ongoing to empower P-KIMMO to do 
morphological analysis with an on-line dictionary. 



2. The Two-Level Formalism 

Since the purpose of this report is to describe the implementational 
aspects of P-KIMMO, I will not attempt to provide a detailed description of the 
two-level formalism. Rather, I refer the reader to Koskenniemi (1983) for the 
full exposition of the formalism. Karttunen (1983) is also a valuable source. 
The description given in this section is intended to give readers who are not 
familiar with the two-level formalism a flavor of it. 

As suggested in its nomenclature, the two-level model has two levels of 
morphophonological representations: the lexical level and surface level. 
These two levels are associated by morphophonological rules which specify 
legitimate pairs of characters, as illustrated in figure 2.1. 



lexical:   s  p  y  +  s
           |  |  |  |  |      (two-level rules)
surface:   s  p  i  e  s

Figure 2.1 

The general format of two-level rules is given in figure 2.2. CP, which stands 
for "correspondence", refers to a lexical/surface pair. LC and RC refer to the 
left and right environment, respectively. OP is a logical operator which is 
instantiated as <--> ("if and only if") in many cases. 1 What this operator says is 
that CP is obligatory in the given context and is possible only in that context. 
Note that LC and RC are also character pairs. 

CP OP LC ___ RC 



Figure 2.2: The general format of two-level rules 

The two-level rule that legitimizes the y/i pairs (or, the y/i alternation in 
generative-phonological parlance) in figure 2.1 can be put in prose as follows 
(cited from Karttunen and Wittenburg 1983). 

Y-replacement: y/i <--> C +/= - {i, a} 

After a consonant, lexical y corresponds to i 
when a lexical suffix marker and any pair 
other than i/i or a/a follows; to y elsewhere. 

Figure 2.3 

The capital "C" stands for all the consonant pairs. "+/=" abbreviates the pairs 
consisting of the suffix marker plus any character. 2 To sum up, two-level 
rules express correspondences between lexical and surface forms. This 
correspondence relation between two characters is a major departure from 
traditional generative phonology and is characteristic of two-level rules. 

1 In Koskenniemi (1983: sec. 2.3.9), this operator is interpreted as the combination of 
two operators, namely --> and <--, which mean "only if" and "if", respectively. 
2 These abbreviatory conventions make the two-level rules of a language and the 
corresponding finite state automata more compact and easier to read. See section 3.2. 



3. Components of P-KIMMO 
3.1. Lexicon 

As in other implementations of two-level morphology, a lexicon is 
represented in the form of a letter tree in order to gain efficient lexical 
access. 3 For example, an English lexicon that contains the words be, beer, 
believe, big, and boy is roughly represented as follows: 



            e - r
           /
      e - l - i - e - v - e
     /
b - i - g
     \
      o - y

Figure 3.1: The Lexical Tree 

The last character of each word in the tree is associated with lexical entries. 
The b-e-e-r branch, for example, carries the entry for be at the first e and the 
entry for beer at the r. The b and the second e do not have any lexical specifications 
because b and bee are not words in this sample lexicon. In the current version 
of P-KIMMO, a lexical entry is a list consisting of a continuation class and a 
feature description. The empty list symbol [] is used to mark characters 
devoid of lexical entries. Figure 3.2 shows the actual machine-readable format 
of the lexical tree in figure 3.1. Notice that the copular verb be has multiple 
entries. 

[("b", [],
   [("e", [[#, "AUX"], [ivl, ""]],
       [("e", [],
           [("r", [[n, ""]], [])]),
        ("l", [],
           [("i", [],
               [("e", [],
                   [("v", [],
                       [("e", [[v, ""]], [])])])])])]),
    ("i", [],
       [("g", [[a, ""]], [])]),
    ("o", [],
       [("y", [[n, ""]], [])])])]

Figure 3.2: Pretty-printed list representation of the lexical tree 

alternation(a, [ca, cs]).
alternation(ivl, [pr, i, ag, ab]).

Figure 3.3: Possible expansions of the continuation classes "a" and "ivl" 

The symbols #, ivl, n, v, and a are continuation classes which allow the 
recognizer to select the possible affixes. For example, the continuation class a 
has the comparative (ca) +er and superlative (cs) +est as its members, as shown 
in figure 3.3, which makes it possible to analyze words like bigger and biggest. # 
indicates termination, so no continuation is permitted. This shows that the 
continuation of a lexical formative is specified in its lexical entry, and thus 
how morphotactics is described in the two-level formalism. 

3 Although it is common practice to represent lexicons as lexical trees, it is controversial 
whether the tree representation is appropriate for modelling human performance. See 
Forster (1976) for some arguments against lexical trees from psycholinguistic 
perspectives. 

Although the tree format increases the efficiency of lexical access, it is 
very laborious and error-prone to encode a lexical tree by hand, since it 
requires extremely careful arrangement of parentheses and brackets. 
Indentation for readability is also a tedious job. The problem 
becomes much more serious if one wants to build a large lexicon: it is 
almost impossible for the human eye to keep track of the parentheses 
and brackets needed to properly enclose a big lexical tree. This bulkiness of 
the lexical tree also makes it difficult to augment the lexicon with new words. 

As in KIMMO, a lexicon compiler has been added to the P-KIMMO system 
which automatically builds the corresponding lexical tree from an 
easy-to-format dictionary (called "EZ-Lexicon"). An EZ-Lexicon consists of Prolog 
clauses, each of which contains a lexical string and the information relevant to 
that lexical item (i.e. continuation class and features). The EZ-Lexicon 
corresponding to the English lexical tree above is given in figure 3.4. 

lexicon(root, "be",      [[#, "AUX"], [ivl, ""]]).
lexicon(root, "beer",    [[n, ""]]).
lexicon(root, "believe", [[v, ""]]).
lexicon(root, "big",     [[a, ""]]).
lexicon(root, "boy",     [[n, ""]]).

Figure 3.4: A Sample EZ-Lexicon of English 

The tree-building program reads an EZ-Lexicon file and writes the 
corresponding lexical tree to a designated output file. The lexical tree is also 
saved in the pretty-printed format shown in figure 3.2. 
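
The tree-building step can be illustrated with a small sketch. It is written in Python rather than in the Prolog of the actual lexicon compiler, and it is not the P-KIMMO code: the names build_tree and lookup, and the use of Python lists for nodes, are assumptions made for the illustration. Each node mirrors the (character, lexical entries, children) triples of figure 3.2.

```python
# Illustrative sketch only: build a letter tree from EZ-Lexicon-style entries.
# A node is [char, lexical_entries, children], mirroring figure 3.2.

EZ_LEXICON = [
    ("be",      [["#", "AUX"], ["ivl", ""]]),
    ("beer",    [["n", ""]]),
    ("believe", [["v", ""]]),
    ("big",     [["a", ""]]),
    ("boy",     [["n", ""]]),
]

def build_tree(entries):
    """Return the list of top-level nodes of the letter tree."""
    root = []
    for word, info in entries:
        level = root
        for i, ch in enumerate(word):
            # Reuse the node for this character if a previous word created it.
            node = next((n for n in level if n[0] == ch), None)
            if node is None:
                node = [ch, [], []]
                level.append(node)
            if i == len(word) - 1:
                node[1].extend(info)   # entries sit on the word's last character
            level = node[2]
    return root

def lookup(tree, word):
    """Return the lexical entries stored at the end of word ([] if none)."""
    level, node = tree, None
    for ch in word:
        node = next((n for n in level if n[0] == ch), None)
        if node is None:
            return []
        level = node[2]
    return node[1] if node is not None else []

tree = build_tree(EZ_LEXICON)
print(lookup(tree, "be"))    # entries for "be"
print(lookup(tree, "bee"))   # empty: "bee" is not a word in this lexicon
```

Because words sharing a prefix share nodes, looking up a string costs one step per character regardless of how many words the lexicon holds, which is the efficiency argument made above.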

There is another motivation for developing the lexicon compiler. 
As mentioned in section 1, an effort is being made to augment 
P-KIMMO by making it capable of doing morphological analysis with an on-line 
dictionary. It is very unlikely, however, that on-line dictionaries are 
organized in such a way that P-KIMMO can make direct use of them. One way to 
make P-KIMMO run on such a dictionary would be to modify the recognition 
algorithm so that it could consult the original dictionary format. This, 
however, would obviously sacrifice efficient lexical access. Since on-line 
dictionaries are usually huge, it is not hard to imagine that the processing 
time would increase significantly. For this reason, preprocessing the 
dictionary (i.e. building the lexical tree from it) has been chosen to 
achieve the goal. 



3.2. Rules As Finite State Automata 

The most innovative feature of Koskenniemi's model is the use of finite 
state automata to encode two-level rules. 4 This is why two-level morphology is 
often referred to as "finite state morphology". The use of finite state 
automata explains why the processor is so efficient. It is well known that 
finite state machines are computationally efficient and easy to implement. 
The finite state transducer behaves in exactly the same way as the ordinary 
finite automaton except that it reads a pair of input symbols. As shown in 
figure 3.4, a pair of characters is the input to the transducer, which is 
currently scanning the "yi" pair. 

4 Precisely speaking, finite state "transducers", in the sense that the input symbol is a 
pair of characters rather than a single symbol. 
[Diagram: a transducer scanning a character pair, one character on the upper tape and one on the lower tape.]

Figure 3.4 



The English rule that expresses the y/i alternation can be depicted by a 
transition network diagram. Figure 3.5 is the graphical representation of the 
Y-spelling rule for English given in Karttunen and Wittenburg (1983). 




Figure 3.5 

On the implementational side, finite state automata are represented as 
state transition tables. Figure 3.6 shows the tabular form of the Y-spelling 
rule. The full-fledged machine is presented in the appendix. The internal 
structure of the automata in P-KIMMO is somewhat different from the one in 
KIMMO (i.e. the LISP version). It is also slightly different from that of Boisen's 
Pro-KIMMO. 



[(y_spelling, [true, true, false, false, true, false]),
 ("CC", [2,2,0,1,1,0]),
 ("yy", [1,5,0,1,1,0]),
 ("yi", [0,3,0,0,0,0]),
 ("+=", [1,1,4,1,6,0]),
 ("ii", [1,1,0,0,1,1]),
 ("aa", [1,1,0,0,1,1]),
 ("==", [1,1,0,1,1,0])]

Figure 3.6: Y-spelling automaton for English 

A transducer is encoded as a list, each member of which is in turn a LISP-like list 
consisting of a character pair and a transition list, except that the first member 
has the rule name (i.e. y_spelling) in place of a character pair and a list of 
truth values in place of a transition list. "true" 
indicates a final state. The Y-spelling transducer has states 1, 2, and 5 as its 
final states. 0 means failure. An outgoing state is implicit in the sequential 
order of a state list, and an incoming state is represented as a member of the 
transition list. Taking the "yy" pair as an example, if the outgoing state is 2, 
the incoming state is set to state 5. The pair "CC" is an abbreviation for all the 
consonant pairs that are not "specified" in the automaton. Thus, the "CC" pair 
in the Y-spelling automaton stands for all the consonant pairs except for "yy". 
Theoretically, "CC" includes any possible combination of consonants such as 
"fv", "zs", "bp", and so on. The abbreviatory conventions are not interpreted 
that way, however. Their interpretation varies depending on the rules of a 
language. For example, the schematic pair "CC" includes a pair like "fv" only 
if it is "specified" in some other rule. The symbol "=" can be thought of as a 
wildcard which represents any character. It roughly corresponds to the 
"elsewhere condition" in phonological terms. Taking the Y-spelling rule 
again as an example, "==" stands for all the pairs other than the pairs specified 
and those subsumed by more "specific" schemata (in this particular case, "+=" and 
"CC") in the automaton. This "specificity hierarchy" is another crucial factor 
in interpreting two-level rules/automata. Since these notational conventions are 
very important for understanding the two-level formalism, I refer the reader to 
Koskenniemi (1983) and Karttunen (1983) for a detailed description. 
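
To make the table lookup concrete, the following sketch runs the Y-spelling table of figure 3.6 over a sequence of lexical/surface pairs. It is a Python illustration, not the P-KIMMO code, and it rests on several assumptions that are not stated in this report: the consonant set, the use of "0" for a null surface character, the restriction of "CC" to identical consonant pairs, and the order in which a pair is matched against the rows (exact pair first, then "+=", then "CC", then "=="). It also checks this one rule in isolation; in the full system a pair such as +/e must be licensed by the other rules as well.

```python
# Illustrative sketch only: stepping through the Y-spelling table of figure 3.6.

CONSONANTS = set("bcdfghjklmnpqrstvwxz")   # assumed; y is handled by its own rows

FINAL = [True, True, False, False, True, False]   # states 1, 2 and 5 are final
TABLE = {
    "CC": [2, 2, 0, 1, 1, 0],
    "yy": [1, 5, 0, 1, 1, 0],
    "yi": [0, 3, 0, 0, 0, 0],
    "+=": [1, 1, 4, 1, 6, 0],
    "ii": [1, 1, 0, 0, 1, 1],
    "aa": [1, 1, 0, 0, 1, 1],
    "==": [1, 1, 0, 1, 1, 0],
}

def row_for(lex, surf):
    """Pick the most specific row of the table that covers the pair."""
    pair = lex + surf
    if pair in TABLE:                       # explicitly specified pair
        return pair
    if lex == "+":                          # suffix marker plus any character
        return "+="
    if lex == surf and lex in CONSONANTS:   # unspecified consonant pair
        return "CC"
    return "=="                             # elsewhere

def accepts(pairs):
    """Return True if the rule licenses the whole sequence of pairs."""
    state = 1
    for lex, surf in pairs:
        state = TABLE[row_for(lex, surf)][state - 1]
        if state == 0:                      # 0 means failure
            return False
    return FINAL[state - 1]

print(accepts(zip("spy+s", "spies")))   # y/i before the suffix marker
print(accepts(zip("spy+s", "spyes")))   # y/y in the same context
```

Under these assumptions the first call succeeds and the second fails, which is the "if and only if" behaviour of the rule: y/i is possible in that context and y/y is excluded there.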

As in Karttunen and Wittenburg (1983), each automaton in P-KIMMO 
has been hand-coded, which, of course, is very tedious. Along the lines of 
Koskenniemi (1985), a program that compiles automata directly from two-level 
rules is under development. 



3.3. Compiler 

Although the finite state automata described in the previous section are 
machine-readable, they are not the actual data structures that the 
recognition/generation algorithm of P-KIMMO runs on. In the LISP implementation of 
Gajek et al. (1983), finite state automata are compiled into two data structures 
called R-MACHINE and G-MACHINE, which are used for recognition and 
generation, respectively. These data structures enable the recognizer and the 
generator to access finite state automata more efficiently. P-KIMMO basically 
adopts the same idea, but the data structures created by the Prolog version of 
the compiler are different from those presented in Gajek et al. In P-KIMMO, 
there is no distinction between R- and G-MACHINE. Rather, a single machine 
produced by the P-KIMMO compiler, which I call simply MACHINE, serves as 
both R- and G-MACHINE. The ability to use a single machine for both 
recognition and generation is due to an inherent property of Prolog, namely 
"reversibility". 5 A single machine makes the system 
more compact, since we do not have to maintain two data structures that are 
structurally identical except that one is accessed by the recognition algorithm 
and the other by the generation algorithm. 

In P-KIMMO, MACHINE can be either asserted in the dynamic database 
or saved as a file by the compiler. Saving MACHINE as a database file would be 
a better choice if the user works with a complete set of automata. The recog- 
nition/generation algorithm runs much faster with a separate MACHINE file. 
This is due to the fact that using "pure" data structures generally increases the 
efficiency of a program. Adding MACHINE to the database would be useful if 
one still needs to debug the finite state machine he is working on. Otherwise, 
one has to save MACHINE each time he wants to test it. A fragment of MACHINE 
for English is given in figure 3.7. 

machine("aa", [[(1,1)],
               [(1,1), (5,1), (6,1), (7,1)],
               [(1,2), (2,1), (3,1), (5,1), (7,1), (8,1),
                (11,1), (14,1), (15,1)],
               [(1,1), (2,1), (3,1), (4,1), (6,1)],
               [(1,4), (2,1), (4,16), (5,16), (6,16), (7,16), (8,16), (9,16),
                (10,16), (11,16), (12,16), (13,16), (14,16), (15,16), (16,16)],
               [(1,1), (2,1), (5,1), (6,1)]]).
machine("bb", [[(1,1)],
               [(1,1), (5,1), (6,1), (7,1)],
               [(1,5), (2,1), (3,5), (5,1), (7,1), (9,1),
                (11,1), (14,1), (15,1)],
               [(1,1), (2,1), (3,1), (4,1), (6,1)],
               [(1,1), (4,5), (5,16), (6,16), (7,16), (8,16), (9,16), (10,16),
                (11,16), (12,16), (13,16), (14,16), (15,16), (16,16)],
               [(1,2), (2,2), (4,1), (5,1)]]).

Figure 3.7 

A machine clause carries two arguments: a pair of characters (lexical and 
surface, in that order) and a list of state lists, each of whose members is again a 
LISP-like list consisting of an outgoing and an incoming state. Each state list of a 
machine clause specifies the transitions the current character pair can make under 
one rule. Each machine clause in figure 3.7 has six state lists, which correspond to the six 
morphophonological rules of English (see Appendix). An important thing to 
note is that when the recognizer or generator is invoked, all six state lists 
are checked to see if the character pair at hand is licensed by them. If any 
one of the state lists blocks the pair, the process fails. Although all 
the state lists (i.e. all the rules) are checked when MACHINE receives an input 
pair, this is done serially (cf. Karttunen 1983, section 4.1). This is why 
Karttunen (1983) and Gajek et al. (1983) mention the merging of separate 
machines into a single finite state machine (called BIGMACHINE), which makes 
the process more efficient. Karttunen (1983) gives an algorithm to merge 
transducers into a single equivalent transducer. A program for merging 
transducers is also currently under development. 

5 This non-deterministic programming technique is also closely related to the 
recognition/generation algorithm. See section 3.4 below. 
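
The serial check over all state lists can be sketched as follows. This is a Python illustration rather than the P-KIMMO code; the function name step and the dictionary MACHINE are assumptions made for the example, and only the "aa" clause transcribed from figure 3.7 is included.

```python
# Illustrative sketch only: one MACHINE step for a character pair.
# MACHINE maps a pair to one list of (outgoing, incoming) states per rule.

MACHINE = {
    "aa": [
        [(1, 1)],
        [(1, 1), (5, 1), (6, 1), (7, 1)],
        [(1, 2), (2, 1), (3, 1), (5, 1), (7, 1), (8, 1),
         (11, 1), (14, 1), (15, 1)],
        [(1, 1), (2, 1), (3, 1), (4, 1), (6, 1)],
        [(1, 4), (2, 1)] + [(s, 16) for s in range(4, 17)],
        [(1, 1), (2, 1), (5, 1), (6, 1)],
    ],
}

def step(pair, states):
    """Advance every rule by one pair; return None if any rule blocks it."""
    state_lists = MACHINE.get(pair)
    if state_lists is None:
        return None                      # the pair has no machine clause
    new_states = []
    # The rules are consulted one after another (serially).
    for current, transitions in zip(states, state_lists):
        for outgoing, incoming in transitions:
            if outgoing == current:
                new_states.append(incoming)
                break
        else:
            return None                  # this rule has no arc: the pair is blocked
    return new_states

print(step("aa", [1, 1, 1, 1, 1, 1]))
```

Each call scans up to six lists, one per rule, which is the cost that merging the rules into a single BIGMACHINE is meant to avoid.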



3.4. Recognizer and Generator 

The recognition/generation algorithm is the workhorse of P-KIMMO. 
The reader interested in the algorithm itself is referred to Koskenniemi (1983) 
and Karttunen (1983), though they are not easy to follow. The 
recognition/generation algorithm of P-KIMMO is close to the one described in 
Karttunen (1983) in that it adopts the "depth-first" control strategy. In 
Koskenniemi's (1983) original implementation, the algorithm operates in a 
"breadth-first" manner. In this section, I will briefly mention a characteristic 
of the recognition/generation algorithm which distinguishes P-KIMMO from 
the previous implementations. 

The previous implementations of the two-level model usually have two 
separate routines for recognition and generation. The main difference 
between the two is that the former is driven by the lexicon, whereas the latter 
is not. In P-KIMMO, there is no distinction between the recognition and 
generation algorithms: a single algorithm serves for both. This single routine 
is the result of taking advantage of the non-deterministic 
programming technique of Prolog. In a nutshell, recognition is nothing more 
than the "reverse" mode of generation, and vice versa. A consequence of 
using a single algorithm is that the lexicon is also consulted during 
generation. This prevents the generator from accepting garbage inputs, i.e. 
non-words or combinations of non-words. 6 If the generator were not guided by the 
lexicon, every garbage input would be accepted as long as it satisfied the 
morphophonological conditions of the language. In some domains of 
application, however, it is quite plausible that one wants to use the system to process 
non-words. This would be especially useful when one wishes to store unknown 
words to augment the existing lexicon. As a matter of fact, P-KIMMO includes a 
separate generation routine for these purposes, which makes P-KIMMO more 
flexible, although it is not listed in the appendices. 

6 By "non-words", I mean strings that are not listed in the dictionary. 



4. Evaluation of P-KIMMO 

Since both are implemented in Prolog, it is quite natural that P-KIMMO and Boisen's 
Pro-KIMMO have many things in common. For example, the lexical format of 
the two systems is strictly identical. The way to encode finite state automata is 
also very similar. Nevertheless, the two implementations are substantially 
different in at least one respect. This section briefly discusses the crucial 
difference which, I believe, renders P-KIMMO superior to Pro-KIMMO. 

Although Boisen (1988) alludes to creating new data structures out of 
finite state automata, he does not spell out what type of data structures his 
recognition algorithm uses. It seems obvious, however, that his algorithm 
does not make use of data structures of the kind described in section 3.3. I 
strongly believe that this is why it took his recognizer more than a minute to 
analyze the Japanese word kattemita. 7 Surprisingly enough, P-KIMMO 
took only 0.3 second to recognize the same word, which is significantly 
faster than Pro-KIMMO. Of course, the speed depends heavily on the 
computer used for the test. The CPU time consumed by P-KIMMO to recognize 
kattemita was measured on a Sun Workstation, which is quite fast. 
However, the result of running P-KIMMO on a modest Macintosh SE still 
shows that Pro-KIMMO is painfully slow: it took no more than a couple of 
seconds to process the same word on an SE. The data structures are not the 
only factor that slows things down. 8 As a matter of fact, Boisen attributes 
Pro-KIMMO's inefficiency to the possible continuation classes the recognizer has to go 
through. 9 Given the same complexity caused by continuation classes, 
however, the slowness of Pro-KIMMO must be explained otherwise. This 
is why I believe that, everything else being equal, P-KIMMO's superiority over 
Pro-KIMMO is due to the optimized data structures. 10 

Quite recently, a C implementation of the two-level model for personal 
computers, dubbed PC-KIMMO (version 1.0.3), has been made available. 
Roughly speaking, the system structure of PC-KIMMO is almost identical to that of 
P-KIMMO except that it was written in C. Thus, there is little to compare between the 
two systems in terms of control strategy, data structures, and so on. I will only 
mention some timing results from testing the two systems. To test PC-KIMMO, 
the C source code was compiled on a Unix machine. The timing was done 
on a Sun workstation. In the recognition mode, PC-KIMMO is slightly faster 
than P-KIMMO, by 0.0x or 0.00x second (x usually ranges from 1 to 5), though 
not always: for some test inputs like dying and spies, P-KIMMO was faster 
by 0.00x second. In the generation mode, P-KIMMO (unexpectedly) performs better by 
the same degree. 

Given that P-KIMMO is a little (but not significantly) slower than 
PC-KIMMO in recognizing words, the question is whether there is a way to 
improve the recognition speed of P-KIMMO. Since Prolog lacks data types such as 
the arrays of conventional programming languages, the time needed to consult state 
transition tables grows linearly with the size of the tables. This goes against the 
spirit of the two-level model, in which the complexity of rules does not have 
any significant effect on processing time (cf. Karttunen 1983). Many 
attempts have been made to remedy this shortcoming of Prolog by simulating data 
types such as hash tables (cf. O'Keefe 1990). I believe that further optimization 
of data structures, which is currently under study, could improve 
the performance of P-KIMMO. For example, converting state transition lists into Prolog 
terms would make it possible to pick the desired state transition immediately by 
its argument position. However, even with somewhat defective data 
structures, P-KIMMO is efficient enough to compete with fast systems like 
PC-KIMMO. 
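
The difference between scanning a transition list and picking a transition by position can be sketched as follows. This is a Python analogue, not Prolog and not part of P-KIMMO: the tuple stands in for a Prolog term whose arguments would be fetched by position, the helper names are hypothetical, and the choice of 15 states and of 0 as the failure value are assumptions for the example. The transition list is the third state list of the "aa" clause in figure 3.7.

```python
# Illustrative sketch only: list scan versus positional lookup.

SPARSE = [(1, 2), (2, 1), (3, 1), (5, 1), (7, 1), (8, 1),
          (11, 1), (14, 1), (15, 1)]

def lookup_linear(transitions, state):
    """Scan the (outgoing, incoming) list; cost grows with its length."""
    for outgoing, incoming in transitions:
        if outgoing == state:
            return incoming
    return 0                       # 0 is used here to mean failure

def to_dense(transitions, n_states):
    """Convert the list into a tuple indexed directly by outgoing state."""
    row = [0] * (n_states + 1)     # slot 0 is unused so states index from 1
    for outgoing, incoming in transitions:
        row[outgoing] = incoming
    return tuple(row)

def lookup_indexed(row, state):
    """Pick the transition by position in a single step."""
    return row[state]

DENSE = to_dense(SPARSE, 15)
print(lookup_linear(SPARSE, 14), lookup_indexed(DENSE, 14))
```

The two lookups return the same incoming state for every outgoing state; the indexed form trades some memory for a cost that no longer depends on the size of the table.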



7 As in Boisen (1988), I assume the two-level description of Japanese in Alam (1983). 

8 There are several overheads for the two-level model in general. For example, the 
traversal of the lexical tree is futile in many cases. Consult Barton et al. (1985) for a general 
discussion of problems with the two-level formalism. 

9 See section 3.2.3 of Boisen (1988) for Prolog-related problems in implementing two-level 
morphology. 

10 Since Boisen (1988) lacks a description of the implementational aspects of his 
system, the comparison given here cannot be considered an empirical result. 
5. Using P-KIMMO 



5.1 Running P-KIMMO on the UNIX computer 

Before running the recognizer and generator, the user needs to convert 
the finite state automata of a specific language into the suitable data structures 
(i.e. MACHINE). To compile automata, type "compile.". Then, the user is 
prompted to enter the file name that contains the finite state automata of the 
language the user has in mind. 

| ?- compile.

Input filename?
|: 'english.aut'.

Output filename?
|: 'machine.eng'.

Figure 5.1 

To call up the system, all the user has to do is type "kimmo." at the 
prompt (be sure not to omit the period!). That command will automatically load 
all the relevant files. Then, the user is again prompted to enter the data files 
(i.e. the MACHINE and the lexicon) to be examined. 

| ?- kimmo.

Which machine?
|: 'machine.eng'.

Which dictionary?
|: 'english.lex'.



Welcome to P-KIMMO!!!

Copyright (C) 1991 by Kang-Hyuk Lee. All rights reserved.

Figure 5.2 

Now, P-KIMMO is ready to run. Figures 5.3 and 5.4 show sample inputs and 
outputs for recognition and generation, respectively. 

| ?- recognize("dying").

Recognized string: die+ing
Categories: [root, pr]
Feature(s): V PROG

RECOGNITION TIME = 0.05 sec.

yes

| ?- recognize("died").

Recognized string: die+ed
Categories: [root, ps]
Feature(s): V PAST

Recognized string: die+ed
Categories: [root, pp]
Feature(s): V PAST PRT

RECOGNITION TIME = 0.117 sec.

yes

| ?- recognize("referring").

Recognized string: re`fer+ing
Categories: [root, pr]
Feature(s): V PROG

RECOGNITION TIME = 0.1 sec.

yes

Figure 5.3 

| ?- generate("die+ing").
Generated String: dying
GENERATION TIME = 0.183 sec.
yes

| ?- generate("die+ed").
Generated String: died
GENERATION TIME = 0.033 sec.
yes

| ?- generate("re`fer+ing").
Generated String: referring
GENERATION TIME = 0.0669999 sec.
yes

Figure 5.4 

When the recognizer is invoked, it displays the lexical string of the input, the 
category names of the elements involved in recognition, the features of these 
categories, and finally the CPU time for recognition. For example, the surface 
string dying is analyzed as combining the root die with the 
morpheme ing of category "pr" ("+" indicates a morpheme boundary). "V" and "PROG" 
stand for "verb" and "progressive", respectively. The generator only returns 
the surface string of the input and the generation time. 

As mentioned earlier, another mode of P-KIMMO is 
available which is suitable for the development of a new machine and 
dictionary. A tracing facility is also being implemented for debugging. This 
would allow the user to detect errors more easily. 



5.2 Running P-KIMMO on the Macintosh computer 

Though not a stand-alone application, the Macintosh version of 
P-KIMMO provides the user with a menu-driven interface. Upon opening the 
file named "MacKIMMO", 11 the user is given the screen with the Output 
Window in figure 5.1. 




Figure 5.1 



The last three menus in the menu bar ("Compile", "Recognizer", and 
"Generator") are the ones created by MacKIMMO. The addition of these menus 
has led to the suppression of other built-in menus. First, click on the 
"Compile" pull-down menu, which currently contains two data types: 
"Engdata" (for English) and "Nippondata" (for Japanese). Figure 5.2 shows 
"Engdata" being selected from the "Compile" menu, which leads MacKIMMO to 
compile the English automata and load the English lexicon. After the 
compilation is done, a message will appear in the Output Window, as in figure 5.3. 



11 MacKIMMO runs on LPA MacProlog 3.0. 
[Screen shot: the "Compile" menu pulled down with "EngData" selected; the "Recognizer" and "Generator" menus appear to its right.]

Figure 5.2 



[Screen shot: the Output Window showing the following message.]

Welcome to P-KIMMO!!!
Copyright (C) 1991 by Kang-Hyuk Lee. All rights reserved.

Figure 5.3 

MacKIMMO is now ready to run. To test the recognizer, select "Surface 
String..." from the "Recognizer" menu (Figure 5.4). Then the user is provided 
with a dialog box for entering a string. On typing a surface string followed by 
either the return key or a click on the OK button, the analyzed result is 
displayed in the Output Window, as illustrated in figure 5.5. 



[Screen shot: the "Recognizer" menu pulled down, showing the item "Surface string...".]

Figure 5.4 








[Screen shot: a dialog box labeled "Enter a string to recognize" above the Output Window, which shows the following text.]

Welcome to P-KIMMO!!!
Copyright (C) 1991 by Kang-Hyuk Lee. All rights reserved.

Recognized string: die+ing
Categories: [root, pr]
Feature(s): V PROG

Figure 5.5 

The same goes for the generator. An illustration is given in figures 5.6 
and 5.7. Note that the generated string is displayed above the previously 
recognized string. 



[Screen shot: the "Generator" menu pulled down, showing the item "Lexical string...".]

Figure 5.6 
[Screen shot: a dialog box labeled "Enter a string to generate", containing the input die+ing and a Cancel button, above the Output Window, which shows the following text.]

Welcome to P-KIMMO!!!
Copyright (C) 1991 by Kang-Hyuk Lee. All rights reserved.

Generated String: dying

Recognized string: die+ing
Categories: [root, pr]
Feature(s): V PROG

No more solutions

Figure 5.7 
/**********************************************************************
 **
 **  "The Recognition/Generation Algorithm"
 **
 **  Copyright (C) 1991 Kang-Hyuk Lee
 **  All rights reserved.
 **
 **********************************************************************/

%%% Unlike previous implementations, there is no algorithmic
%%% distinction between recognition and generation in the P-KIMMO
%%% system. This is the result of taking advantage of the non-
%%% deterministic programming technique inherent in Prolog.

recognize(Surface) :-
    statistics(runtime, _Now),              % reset the runtime interval
    % Collect every analysis of the surface string.
    findall([Lexical, Cat_List, Feature_List],
            transduce(Lexical, Surface, Cat_List, Feature_List),
            Analyses), nl,
    (   Analyses == [] -> (nl, write('No solution Available'), nl)
    ;   write_results(Analyses) ),
    statistics(runtime, [_Total, Since]),   % Since: msec since the last call
    Since1 is Since/1000,
    write('RECOGNITION TIME = '),
    write(Since1),
    write(' sec.'), nl.



%%% Note: Since transduce/4 is not embedded in findall/3, the
%%% generator doesn't backtrack to see if there are more solutions
%%% as the recognizer does. To get all solutions, just put
%%% transduce/4 into findall/3 as in recognize/1.
%%%

generate(Lexical) :-
    statistics(runtime, _Now),
    transduce(Lexical, Surface, Cat_List, Feature_List),
    write('Generated String: '),
    write_output(Surface),
    statistics(runtime, [_Total, Since]),
    Since1 is Since/1000, nl,
    write('GENERATION TIME = '),
    write(Since1),
    write(' sec.'), nl.



is_final([], []).
is_final([Final|Rest], [FinalList|RestFinals]) :-
    member(Final, FinalList),
    is_final(Rest, RestFinals).

final([Final|Rest]) :-
    finality([FinalList|RestFinals]),
    is_final([Final|Rest], [FinalList|RestFinals]).
write_results([]).
write_results([[Lexical, Cat_List, Feature_List]|Rest_Cat_Features]) :-
    write('Recognized String: '),
    write_output(Lexical), nl,
    write('Categories: '),
    write(Cat_List), nl,
    write('Feature(s): '),
    write_feature(Feature_List), nl, nl,
    write_results(Rest_Cat_Features).

write_output([]).
write_output([Char|Rest_Char]) :-
    put(Char),
    write_output(Rest_Char).

write_feature([]).
write_feature([Feature|Rest]) :-
    write_output(Feature), write(' '),
    write_feature(Rest).



%%% transduce(Lexical_String, Surface_String,
%%%           List_of_Categories, Features)
%%%
%%% Features: Bundle of features pertaining to categories
%%%           involved in the input string
%%%
transduce([Init_Char|Rest_Lex_Char], [Init_Char|Rest_Surf_Char],
          [Cat|Rest_Cat], Features) :-
    initials(State),
    lexicon(Cat, [([Init_Char], Cont_Info, Rest_Char_and_Cont_Info)]),
    move_automata(State,
                  [Init_Char|Rest_Lex_Char],
                  [Init_Char|Rest_Surf_Char],
                  [([Init_Char], Cont_Info, Rest_Char_and_Cont_Info)],
                  Rest_Cat,
                  Features).

transduce(State, [], [], [Cont_Info|Rest_Cont_Info], [], [Info|[]]) :-
    check_cont_list([Cont_Info|Rest_Cont_Info], [], Info, []),
    final(State).

transduce(State, [Lex_Char], [Surf_Char],
          [Cont_Info|Rest_Cont_Info], [Cat], [Info|Info2]) :-
    check_cont_list([Cont_Info|Rest_Cont_Info], Cat, Info,
                    [([Lex_Char], Cont_Info2, Rest_Char_and_Cont_Info)]),
    find_arc([Lex_Char, Surf_Char], State, State2),
    transduce(State2, [], [], Cont_Info2, [], Info2).

transduce(State, [Lex_Char|Rest_Lex_Char], [],
          [Cont_Info|Rest_Cont_Info],
          [Cat|Rest_Cat], [Info|Info2]) :-
    check_cont_list([Cont_Info|Rest_Cont_Info], Cat, Info,
                    [([Lex_Char], Cont_Info2, [])]),
    find_arc([Lex_Char, 0], State, State2),
    transduce(State2, Rest_Lex_Char, [], Cont_Info2, Rest_Cat, Info2).
%%% transduce(Current_State, Lexical, Surface,
%%%           Lexicon, Categories, Features)
%%%
%%% Lexicon: New lexical configuration to be used to process
%%%          the rest of the input string
%%%
transduce(State,
          [Lex_Char|Rest_Lex_Char],
          [Surf_Char|Rest_Surf_Char],
          [Cont_Info|Rest_Cont_Info],
          [Cat|Rest_Cat],
          [Info|Info2]) :-
    check_cont_list([Cont_Info|Rest_Cont_Info], Cat, Info,
                    [([Lex_Char], Cont_Info2, Rest_Char_and_Cont_Info)]),
    move_automata(State,
                  [Lex_Char|Rest_Lex_Char],
                  [Surf_Char|Rest_Surf_Char],
                  [([Lex_Char], Cont_Info2, Rest_Char_and_Cont_Info)],
                  Rest_Cat,
                  Info2).



%%% move_automata(Current_State, Lexical_String, Surface_String,
%%%               Lexicon, Category, Info).
%%%
%%% Lexicon:  current configuration of the lexicon
%%% Category: list of categories
%%% Info:     Grammatical information
%%%
move_automata(State1,
              [Lex_Char|Rest_Lex_Char],
              [Surf_Char|Rest_Surf_Char],
              Entry,
              Cat,
              Info) :-
    lexmatch([Lex_Char], Entry,
             [([Lex_Char], Cont_Info, Rest_Char_and_Cont_Info)]),
    find_arc([Lex_Char, Surf_Char], State1, State2),
    % if Lex_Char does not have an entry
    (   Cont_Info == []
    % process the next character pair
    ->  move_automata(State2,
                      Rest_Lex_Char,
                      Rest_Surf_Char,
                      Rest_Char_and_Cont_Info,
                      Cat,
                      Info)
    % otherwise, i.e. if Lex_Char has an entry
    % either do transduce/6
    ;   (   transduce(State2, Rest_Lex_Char,
                      Rest_Surf_Char,
                      Cont_Info, Cat, Info)
        ;
        % if transduce/6 fails, go on to process the next pair.
        %
        % Even if a lexical character has an entry, that doesn't
        % necessarily mean that the lexical string scanned by that
        % time constitutes part of the input string. For example,
        % consider the word "bite". After scanning "t", which has an
        % entry, the parser will try to match the final character "e"
        % with some character by looking up in another lexical item,
        % which eventually fails. Therefore, the parser needs to
        % backtrack in order to analyze "bite" as a single word.
            move_automata(State2,
                          Rest_Lex_Char,
                          Rest_Surf_Char,
                          Rest_Char_and_Cont_Info,
                          Cat,
                          Info)
        )
    ).

move_automata(State1,
              [Lex_Char|Rest_Lex_Char],
              [Surf_Char|Rest_Surf_Char],
              Entry,
              Cat,
              Info) :-
    lexmatch([Lex_Char], Entry,
             [([Lex_Char], Cont_Info, Rest_Char_and_Cont_Info)]),
    find_arc([Lex_Char, 0], State1, State2),    % e.g. "+0", "`0"
    (   Cont_Info == []
    ->  move_automata(State2,
                      Rest_Lex_Char,
                      [Surf_Char|Rest_Surf_Char],
                      Rest_Char_and_Cont_Info,
                      Cat,
                      Info)
    ;   (   transduce(State2, Rest_Lex_Char,
                      [Surf_Char|Rest_Surf_Char],
                      Cont_Info, Cat, Info)
        ;   move_automata(State2,
                          Rest_Lex_Char,
                          [Surf_Char|Rest_Surf_Char],
                          Rest_Char_and_Cont_Info,
                          Cat,
                          Info)
        )
    ).



lexmatch(Lex_Char,
         [(Lex_Char, Cont_Info, Rest_Char_and_Cont_Info)|_],
         [(Lex_Char, Cont_Info, Rest_Char_and_Cont_Info)]).
lexmatch(Lex_Char,
         [_|Other_Lex_Char],
         [(Lex_Char, Cont_Info, Rest_Char_and_Cont_Info)]) :-
    lexmatch(Lex_Char,
             Other_Lex_Char,
             [(Lex_Char, Cont_Info, Rest_Char_and_Cont_Info)]).
%%% Find transitions whose arc is labelled with this pair
%%%
%%% find_arc(Pair, State_List1, State_List2)
%%%   State_List1: list of outgoing states
%%%   State_List2: list of incoming states
%%%
find_arc([Lex_Char, Surf_Char], [State1|Rest1], [State2|Rest2]) :-
    machine([Lex_Char, Surf_Char], List_of_Lists),
    all_member([State1|Rest1], [State2|Rest2], List_of_Lists).



%%% all_member(SL1, SL2, LSL): finds a transition rule by rule
%%%   SL1: list of outgoing states
%%%   SL2: list of incoming states
%%%   LSL: list of state lists
%%%        The number of state lists in LSL corresponds to the
%%%        number of morphological rules.
%%%
all_member([], [], []).
all_member([State1|Rest1], [State2|Rest2], [List|RestList]) :-
    member((State1, State2), List),
    all_member(Rest1, Rest2, RestList).
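
%%% A minimal sketch of how find_arc/3 and all_member/3 advance every
%%% rule automaton in step. The toy_ names and the arc data are
%%% hypothetical and invented for the example; the sketch also assumes a
%%% Prolog that provides member/2 in library(lists).

```prolog
:- use_module(library(lists)).

% Hypothetical MACHINE entry for the pair y:i over two rules.
% Each inner list holds the (From, To) arcs of one rule for this pair.
toy_machine([y,i], [[(1,2),(3,1)], [(1,1),(2,3)]]).

% One current state per rule goes in; one next state per rule comes out.
toy_all_member([], [], []).
toy_all_member([S1|R1], [S2|R2], [Arcs|RestArcs]) :-
    member((S1, S2), Arcs),
    toy_all_member(R1, R2, RestArcs).

toy_find_arc(Pair, From, To) :-
    toy_machine(Pair, ArcLists),
    toy_all_member(From, To, ArcLists).

% ?- toy_find_arc([y,i], [1,2], To).
% To = [2,3].
% ?- toy_find_arc([y,i], [2,1], To).
% false.   (rule 1 has no y:i arc leaving state 2, so the pair is blocked)
```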



check_cont_list([[Cont, Info]|_Rest_Cont], Cat, Info, []) :-
    ( Cont == # ).                  % end of string

check_cont_list([[Cont, Info]|_Rest_Cont], Cat, Info,
                [([Lex_Char], Cont_Info, Rest_Char_and_Cont_Info)]) :-
    lexicon(Cat, [([Lex_Char], Cont_Info, Rest_Char_and_Cont_Info)]),
    check_cont(Cont, Cat).

%%% For a lexical item with multiple entries
check_cont_list([_Cont|Rest_Cont], Cat, Info,
                [([Lex_Char], Cont_Info, Rest_Char_and_Cont_Info)]) :-
    check_cont_list(Rest_Cont, Cat, Info,
                    [([Lex_Char], Cont_Info, Rest_Char_and_Cont_Info)]).

check_cont(Cont, Cat) :-
    alternation(Cont, List_of_Alt),
    member(Cat, List_of_Alt).
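
%%% A minimal sketch of what check_cont/2 computes: the sublexicons that a
%%% continuation class may lead to. The two toy_alternation/2 facts copy
%%% the v entry and the catch-all entry from the English lexicon file; the
%%% toy_ names themselves are hypothetical, and member/2 is assumed to
%%% come from library(lists).

```prolog
:- use_module(library(lists)).

toy_alternation(v, [p3, ps, pp, pr, i, ag, ab]).
toy_alternation(Cat, [Cat]).        % a class can always continue as itself

toy_check_cont(Cont, Cat) :-
    toy_alternation(Cont, Alternatives),
    member(Cat, Alternatives).

% ?- findall(C, toy_check_cont(v, C), Cs).
% Cs = [p3,ps,pp,pr,i,ag,ab,v].
```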
/**********************************************************************
**
**     "MACHINE Compiler"
**
**     Copyright (C) 1991 Kang-Hyuk Lee
**     All rights reserved.
**
**********************************************************************/



%%% The compiler converts automata (i.e. transition tables that encode
%%% morphophonological rules) into different data structures, which
%%% the recognition-generation routine utilizes.
%%% The basic idea can be found in Gajek et al. (1983).

%%% Unlike Gajek et al., this compiler does not generate a separate
%%% g-machine and r-machine. The single machine serves as both g- and
%%% r-machine.

initial(1).



compile :-
    write('input file name? '),
    read(File1),
    compile(File1),
    write('output file name? '),
    read(File2),
    tell(File2),
    create_lit_pairs(All_Lit_Pairs),
    assert(all_lit_pairs(All_Lit_Pairs)),
    automata([[(RuleName, FinalStateList)|RestAutomaton]|RestAutomata]),
    create_initials([[(RuleName, FinalStateList)
                      |RestAutomaton]|RestAutomata],
                    Initials),
    write(initials(Initials)), write('.'), nl,
    compile_final([[(RuleName, FinalStateList)
                    |RestAutomaton]|RestAutomata],
                  FinalStateList, FinalList, AllFinals),
    write(finality(AllFinals)), write('.'), nl,
    change_automata([[(RuleName, FinalStateList)
                      |RestAutomaton]|RestAutomata],
                    [[(RuleName, FinalStateList)
                      |NewRestAutomaton]|NewRestAutomata]),
    make_machine([[(RuleName, FinalStateList)
                   |NewRestAutomaton]|NewRestAutomata],
                 [FirstPair|RestPairs]),
    assert_automata([FirstPair|RestPairs]),
    told.
%%% to create all possible character pairs of a given language
%%% They are used to instantiate schematic pairs such as "CC" and "=="
%%%
create_lit_pairs(All_Lit_Pairs) :-
    alphabet(Alphabet),
    make_alphabet_pairs(Alphabet, Alphabet_Pairs),
    automata([Automaton|RestAutomata]),
    create_lit_pairs(RestAutomata, Rest_Pairs, Alphabet_Pairs),
    append(Alphabet_Pairs, Rest_Pairs, All_Lit_Pairs).



%%% to produce all the "xx" character pairs from the alphabet of
%%% a language
%%%
make_alphabet_pairs([], []).
make_alphabet_pairs([[X]|Y], [[X,X]|Z]) :-
    make_alphabet_pairs(Y, Z).

%%% to produce the character pairs that are not of the "xx" type but
%%% are specified in the automata
%%%
create_lit_pairs([], [], Already_Pairs).
create_lit_pairs([[(RN, FSL)|Automaton]|RestAutomata],
                 Conc_Pairs, Alphabet_Pairs) :-
    pick_pairs(Automaton, New_Pairs, Alphabet_Pairs),
    append(Alphabet_Pairs, New_Pairs, Already_Pairs),
    create_lit_pairs(RestAutomata, Rest_Pairs, Already_Pairs),
    append(New_Pairs, Rest_Pairs, Conc_Pairs).

pick_pairs([], [], Already_Pairs).
pick_pairs([(Lit_Pair, StateList)|Rest_Pairs],
           NewPairs, Already_Pairs) :-
    % if the pair is a member of the list of pairs that was
    % created from the alphabet, skip it.
    member(Lit_Pair, Already_Pairs), !,
    pick_pairs(Rest_Pairs, NewPairs, Already_Pairs).
pick_pairs([([Lex, Surf], StateList)|Rest_Pairs],
           NewPairs, Already_Pairs) :-
    % if the pair has either a wildcard or a variable,
    % skip it.
    (((alphabet(any, [Lex]) ;
       alphabet(any, [Surf])) ;
      abbrev([Lex], Alphabet1)) ;
     abbrev([Surf], Alphabet2)), !,
    pick_pairs(Rest_Pairs, NewPairs, Already_Pairs).
pick_pairs([(Lit_Pair, StateList)|Rest_Pairs],
           [NewPair|RestNewPairs], Already_Pairs) :-
    % otherwise, add this pair to the list of character pairs.
    Lit_Pair = NewPair,
    pick_pairs(Rest_Pairs, RestNewPairs, Already_Pairs).



%%% to create the initial list [1,1,1,...] whose length corresponds
%%% to the number of automata
%%%
create_initials([], []).
create_initials([Automaton|RestAutomata], [1|Rest]) :-
    create_initials(RestAutomata, Rest).
%%% to produce the final states of each automaton
%%%
compile_final([], [], [], []).
compile_final([[(RuleName, FinalStateList)|RestAutomaton]|RestAutomata],
              FinalStateList, [State1|RestFinal], [Finals|RestFinals]) :-
    initial(State1),
    compile_finality(FinalStateList, [State1|RestFinal], Finals),
    compile_final(RestAutomata, FinalStateList2,
                  Pos_Num_List, RestFinals).



compile_finality([State], [FinalState], [X]) :-
    State == true,
    X = FinalState.
compile_finality([State], [FinalState], []) :-
    State == false.
compile_finality([State|RestState],
                 [FinalState1|[FinalState2|RestFinal]],
                 [X|Y]) :-
    FinalState2 is FinalState1 + 1,
    State == true,
    X = FinalState1,
    compile_finality(RestState, [FinalState2|RestFinal], Y).
compile_finality([State|RestState],
                 [FinalState1|[FinalState2|RestFinal]],
                 X) :-
    FinalState2 is FinalState1 + 1,
    State == false,
    compile_finality(RestState, [FinalState2|RestFinal], X).
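
%%% A minimal sketch of the conversion that compile_finality/3 performs:
%%% a list of true/false flags, one per state, becomes the list of final
%%% state numbers. toy_finals/2 and toy_finals/3 are hypothetical names,
%%% not part of the compiler; the flags in the example query are those of
%%% the i_spelling automaton in the English automata file.

```prolog
% toy_finals(Flags, Finals): Finals holds the 1-based positions of the
% flags that are true.
toy_finals(Flags, Finals) :-
    toy_finals(Flags, 1, Finals).

toy_finals([], _, []).
toy_finals([true|Flags], N, [N|Finals]) :-
    N1 is N + 1,
    toy_finals(Flags, N1, Finals).
toy_finals([false|Flags], N, Finals) :-
    N1 is N + 1,
    toy_finals(Flags, N1, Finals).

% ?- toy_finals([true,false,false,false,true,true,false], Finals).
% Finals = [1,5,6].
```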



%%% Each automaton is processed by the recursive call of
%%% change_automata/2. All the schematic pairs such as "==" are
%%% instantiated by change_automaton/3, which calls the relevant
%%% clause (elsewhere/3 or replace_variable/3) depending on the
%%% pair to be processed.
%%%
change_automata([], []).
change_automata([[(RN, FSL)|RestAutomaton]|RestAutomata],
                [[(RN, FSL)|SortedAutomaton]|NewRestAutomata]) :-
    change_automaton(RestAutomaton, NewRestAutomaton,
                     [(RN, FSL)|RestAutomaton]),
    mergesort(NewRestAutomaton, SortedAutomaton),
    change_automata(RestAutomata, NewRestAutomata).



%%% to produce the final data structures (i.e. MACHINE) that the
%%% recognition/generation algorithm runs off.
%%% The description of MACHINE is given in section 3.3.
%%%
make_machine([[(RN, FSL)]|Rest_RN_And_FSL], []).
make_machine([[(RN, FSL)|[(Lit_Pair, X)|Rest]]|RestAutomata],
             [(Lit_Pair, [X|Y])|RestPairs]) :-
    make_machine1([[(RN, FSL)|[(Lit_Pair, X)|Rest]]|RestAutomata],
                  (Lit_Pair, [X|Y])),
    remove_pair([[(RN, FSL)|[(Lit_Pair, X)|Rest]]|RestAutomata],
                [[(RN, FSL)|Rest]|NewRestAutomata]),
    make_machine([[(RN, FSL)|Rest]|NewRestAutomata], RestPairs).
make_machine1([], (Lit_Pair, [])).
make_machine1([[(RN, FSL)|[(Lit_Pair, X)|Rest]]|RestAutomata],
              (Lit_Pair, [X|Y])) :-
    make_machine1(RestAutomata, (Lit_Pair, Y)).

remove_pair([], []).
remove_pair([[(RN, FSL)|[(Lit_Pair, X)|Rest]]|RestAutomata],
            [[(RN, FSL)|Rest]|NewRestAutomata]) :-
    remove_pair(RestAutomata, NewRestAutomata).

assert_automata([]).
assert_automata([(Lit_Pair, StateList)|RestPairs]) :-
    write(machine(Lit_Pair, StateList)), write('.'), nl,
    assert_automata(RestPairs).



change_automaton([], [], Automaton).
change_automaton([("==", StateList)],
                 [(Else_Lit_Pair, Arcs)|RestArcs], Automaton) :-
    elsewhere(("==", StateList), Automaton,
              [(Else_Lit_Pair, StateList)|RestPairs]),
    else_list_arcs([(Else_Lit_Pair, StateList)|RestPairs],
                   [(Else_Lit_Pair, Arcs)|RestArcs]),
    change_automaton([], [], Automaton).
change_automaton([(Var_Pair, StateList)|Rest], All, Automaton1) :-
    replace_variable((Var_Pair, StateList), Automaton1,
                     [(Replaced_Pair, StateList)|RestPairs]),
    else_list_arcs([(Replaced_Pair, StateList)|RestPairs],
                   [(Replaced_Pair, Arcs)|RestArcs]),
    % to update the automaton to prevent the Elsewhere condition
    % from being applied to the instantiated literal pairs
    delete_pair(Automaton1, (Var_Pair, StateList), Automaton2),
    append(Automaton2, [(Replaced_Pair, StateList)|RestPairs],
           Automaton3),
    change_automaton(Rest, Rest_Lit_Pairs, Automaton3),
    conc([(Replaced_Pair, Arcs)|RestArcs], Rest_Lit_Pairs, All).
change_automaton([(Lit_Pair, [State2])|Rest],
                 [(Lit_Pair, [(State1, State2)])|RestArcs], Automaton) :-
    initial(State1),
    change_automaton(Rest, RestArcs, Automaton).
change_automaton([(Lit_Pair, [State2|[State4|RestStates]])|Rest],
                 [(Lit_Pair, Arcs)|NewRest], Automaton) :-
    initial(State1),
    list_arcs(State1, [State2|[State4|RestStates]], Arcs),
    change_automaton(Rest, NewRest, Automaton).



else_list_arcs([], []).
else_list_arcs([(Else_Lit_Pair, StateList)|Rest],
               [(Else_Lit_Pair, Arcs)|RestArcs]) :-
    initial(State1),
    list_arcs(State1, StateList, Arcs),
    else_list_arcs(Rest, RestArcs).
%%% As described in section 3.2, an outgoing state is implicit in
%%% the sequential order of a state list. This is what "State3 is
%%% State1 + 1" in list_arcs/3 is all about.
%%%
list_arcs(State1, [State2], [Arc]) :-
    State2 =\= 0,
    Arc = (State1, State2).
list_arcs(State1, [State2], []) :-
    State2 =:= 0.
list_arcs(State1, [State2|[State4|RestStates]], [Arc|RestArcs]) :-
    State3 is State1 + 1,
    State2 =\= 0,
    Arc = (State1, State2),
    list_arcs(State3, [State4|RestStates], RestArcs).
list_arcs(State1, [State2|[State4|RestStates]], Arcs) :-
    State3 is State1 + 1,
    % the incoming state is 0 (i.e. failure),
    % don't add this state pair to MACHINE.
    State2 == 0,
    list_arcs(State3, [State4|RestStates], Arcs).
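
%%% A minimal sketch of the table-to-arcs conversion done by list_arcs/3,
%%% written more compactly. toy_arcs/3 is a hypothetical name, not part of
%%% the compiler. The example row is the "iy" column of the i_spelling
%%% automaton, whose n-th entry is the state reached from state n and
%%% where 0 means failure.

```prolog
% toy_arcs(From, Targets, Arcs): From is the number of the state that the
% first element of Targets belongs to.
toy_arcs(_, [], []).
toy_arcs(From, [0|Targets], Arcs) :- !,         % 0 = failure: no arc
    Next is From + 1,
    toy_arcs(Next, Targets, Arcs).
toy_arcs(From, [To|Targets], [(From, To)|Arcs]) :-
    Next is From + 1,
    toy_arcs(Next, Targets, Arcs).

% ?- toy_arcs(1, [2,0,0,0,1,0,0], Arcs).
% Arcs = [(1,2),(5,1)].
```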



%%% Instantiate the "==" pair
%%%
elsewhere((Lit_Pair, StateList), [(RN, FSL)|Automaton],
          [(Else_Lit_Pair, StateList)|RestPairs]) :-
    all_lit_pairs([X|Y]),
    assert(statelist(StateList)),
    remove_spec([X|Y], Automaton,
                [(Else_Lit_Pair, StateList)|RestPairs]),
    retract(statelist(StateList)).

%%% Replace a variable pair by the corresponding pairs
%%% e.g. "VV", "CC", "SS", etc.
%%%
replace_variable(([Var,Var], StateList), Automaton,
                 [(Spec_Pair, StateList)|RestPairs]) :-
    abbrev([Var], Alphabet),
    make_pairs(Alphabet, Alphabet_Pairs),
    assert(statelist(StateList)),
    remove_spec(Alphabet_Pairs, Automaton,
                [(Spec_Pair, StateList)|RestPairs]),
    retract(statelist(StateList)).

%%% e.g. "V=", "C=", etc.
%%%
replace_variable(([Var, Wild], StateList), Automaton,
                 [(Spec_Pair, StateList)|RestPairs]) :-
    abbrev([Var], Alphabet),
    [Wild] == "=",
    all_lit_pairs([X|Y]),
    make_else_pairs2(Alphabet, [X|Y], Else_Pairs),
    assert(statelist(StateList)),
    remove_spec(Else_Pairs, Automaton,
                [(Spec_Pair, StateList)|RestPairs]),
    retract(statelist(StateList)).






%%% Constant-Any pair
%%% e.g. "+=", "`=", etc.
%%%
replace_variable(([Cons,Var], StateList), Automaton,
                 [(Spec_Pair, StateList)|RestPairs]) :-
    [Var] == "=",
    all_lit_pairs([X|Y]),
    make_else_pairs(Cons, [X|Y], Else_Pairs),
    assert(statelist(StateList)),
    remove_spec(Else_Pairs, Automaton,
                [(Spec_Pair, StateList)|RestPairs]),
    retract(statelist(StateList)).

%%% Variable-Constant pair
%%%
replace_variable(([Var, Cons], StateList), Automaton,
                 [(Spec_Pair, StateList)|RestPairs]) :-
    abbrev([Var], Alphabet),
    make_pairs(Alphabet, Alphabet_Pairs),
    make_else_pairs1(Cons, Alphabet_Pairs, Else_Pairs),
    assert(statelist(StateList)),
    remove_spec(Else_Pairs, Automaton,
                [(Spec_Pair, StateList)|RestPairs]),
    retract(statelist(StateList)).

%%% Constant-Variable pair
%%% e.g. "+C"
%%%
replace_variable(([Cons, Var], StateList), Automaton,
                 [(Spec_Pair, StateList)|RestPairs]) :-
    abbrev([Var], Alphabet),
    make_pairs(Alphabet, Alphabet_Pairs),
    make_else_pairs(Cons, Alphabet_Pairs, Else_Pairs),
    assert(statelist(StateList)),
    remove_spec(Else_Pairs, Automaton,
                [(Spec_Pair, StateList)|RestPairs]),
    retract(statelist(StateList)).



make_pairs([], []).
make_pairs([[X]|Y], [[X,X]|Z]) :-
    make_pairs(Y, Z).

make_else_pairs(A, [], []).
make_else_pairs(A, [[X,Y]|Z], [Pair|RestPairs]) :-
    A == X, !,              % in case "Lex" is a variable
    [X,Y] = Pair,
    make_else_pairs(A, Z, RestPairs).
make_else_pairs(A, [[X,Y]|Z], Pairs) :-
    make_else_pairs(A, Z, Pairs).

make_else_pairs1(A, [], []).
make_else_pairs1(A, [[X,Y]|Z], [Pair|RestPairs]) :-
    A == Y, !,              % in case "Surf" is a variable
    [X,Y] = Pair,
    make_else_pairs1(A, Z, RestPairs).
make_else_pairs1(A, [[X,Y]|Z], Pairs) :-
    make_else_pairs1(A, Z, Pairs).
make_else_pairs2([], All, []).
make_else_pairs2([[A]|B], [X|Y], Else_Pairs) :-
    make_else_pairs(A, [X|Y], Else_Pairs1),
    make_else_pairs2(B, [X|Y], Else_Pairs2),
    conc(Else_Pairs1, Else_Pairs2, Else_Pairs).



% The termination condition has been complicated a bit
% due to the non-correspondence of else pairs to all pairs
remove_spec([], Automaton, []).
remove_spec([X], Automaton, []) :-
    is_specified(X, Automaton).
remove_spec([X], Automaton, [(Else_Lit_Pair, ElseStateList)]) :-
    statelist(ElseStateList),
    X = Else_Lit_Pair,
    remove_spec([], Automaton, []).

% If the literal pair is specified in the automaton, do nothing.
% Otherwise, add the pair to the new automaton
remove_spec([X|Y], Automaton, ElsePairs) :-
    is_specified(X, Automaton), !,
    remove_spec(Y, Automaton, ElsePairs).
remove_spec([X|Y], Automaton,
            [(Else_Lit_Pair, ElseStateList)|Rest]) :-
    statelist(ElseStateList),
    X = Else_Lit_Pair,
    remove_spec(Y, Automaton, Rest).



% Think of it as the member/2 predicate
is_specified(Lit_Pair, [(Lit_Pair, StateList)|RestAutomaton]).
is_specified(X, [(Lit_Pair, StateList)|RestAutomaton]) :-
    is_specified(X, RestAutomaton).
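
%%% A minimal sketch of the "elsewhere" idea behind remove_spec/3: the
%%% "==" pair stands for every literal pair that the automaton does not
%%% mention explicitly. toy_elsewhere/3 is a hypothetical name, and the
%%% pairs in the example query are invented; memberchk/2 is assumed to be
%%% available from library(lists).

```prolog
:- use_module(library(lists)).

% toy_elsewhere(AllPairs, Specified, ElsePairs): ElsePairs are the
% members of AllPairs that do not occur in Specified, in the same order.
toy_elsewhere([], _, []).
toy_elsewhere([Pair|Pairs], Specified, ElsePairs) :-
    (   memberchk(Pair, Specified)
    ->  ElsePairs = Rest            % already has its own column: skip it
    ;   ElsePairs = [Pair|Rest]     % falls under "=="
    ),
    toy_elsewhere(Pairs, Specified, Rest).

% ?- toy_elsewhere([[a,a],[y,i],[y,y]], [[y,i]], Else).
% Else = [[a,a],[y,y]].
```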
/**********************************************************************
**
**     The English Lexicon
**
**********************************************************************/

alternation(in, [c1]).
alternation(v, [p3, ps, pp, pr, i, ag, ab]).
alternation(iv1, [pr, i, ag, ab]).
alternation(iv2, [p3, pr, i, ag, ab]).
alternation(a, [pa, ca, cs, ly]).
alternation(#, []).
alternation(Cat, [Cat]).

lexicon(n, [0], [[c1, "N SG"]]).
lexicon(n, "+s", [[c2, "N PL"]]).
lexicon(mn, [0], [[c1, "MASS N"]]).
lexicon(c1, [0], [[#, ""]]).
lexicon(c1, "'s", [[#, "GEN"]]).
lexicon(c2, [0], [[#, ""]]).
lexicon(c2, "'", [[#, "GEN"]]).
lexicon(p3, "+s", [[#, "V PRES SG 3RD"]]).
lexicon(ip3, [0], [[#, "V PRES 3RD SING"]]).
lexicon(ps, "+ed", [[#, "V PAST"]]).
lexicon(ips, [0], [[#, "V PAST"]]).
lexicon(pp, "+ed", [[#, "V PAST PRT"]]).
lexicon(ipp, [0], [[#, "V PAST PRT"]]).
lexicon(pr, "+ing", [[#, "V PROG"]]).
lexicon(i, [0], [[#, "V"]]).
lexicon(ip1, [0], [[#, "V PRES SING, 1ST"]]).
lexicon(ag, "+er", [[n, "AG"]]).
lexicon(pa, [0], [[#, "A"]]).
lexicon(ca, "+er", [[#, "ACOMP"]]).
lexicon(cs, "+est", [[#, "A SUP"]]).
lexicon(ly, "ly", [[#, "ADV"]]).
lexicon(ab, "+able", [[#, "VERB ABL"]]).



lexicon(root, "are", [[#, "V PRES SING 2ND"]]).
lexicon(root, "at", [[#, "PREP"]]).
lexicon(root, "a`ttack", [[n, ""], [v, ""]]).
lexicon(root, "be", [[#, "AUX"], [iv1, ""]]).
lexicon(root, "beer", [[n, ""]]).
lexicon(root, "believe", [[v, ""]]).
lexicon(root, "big", [[a, ""]]).
lexicon(root, "bit", [[ips, ""]]).
lexicon(root, "bite", [[iv2, ""]]).
lexicon(root, "bitten", [[ipp, ""]]).
lexicon(root, "boo", [[v, ""]]).
lexicon(root, "boy", [[n, ""]]).
lexicon(root, "cacti", [[in, "N PL"]]).
lexicon(root, "cactus", [[in, "N SG"]]).
lexicon(root, "cat", [[n, ""]]).
lexicon(root, "church", [[n, ""]]).
lexicon(root, "cool", [[a, ""]]).
lexicon(root, "day", [[n, ""]]).
lexicon(root, "did", [[ips, ""]]).
lexicon(root, "die", [[v, ""]]).
lexicon(root, "do", [[iv1, ""]]).
lexicon(root, "does", [[ip3, ""]]).
lexicon(root, "done", [[ipp, ""]]).
lexicon(root, "fox", [[n, ""]]).
lexicon(root, "go", [[iv1, ""]]).
lexicon(root, "goes", [[ip3, ""]]).
lexicon(root, "gone", [[ipp, ""]]).
lexicon(root, "grouch", [[n, ""]]).
lexicon(root, "had", [[#, "AUX"], [ips, ""]]).
lexicon(root, "has", [[#, "AUX"], [ip3, ""]]).
lexicon(root, "have", [[#, "AUX"], [iv1, ""]]).
lexicon(root, "ice", [[mn, ""]]).
lexicon(root, "industry", [[n, ""]]).
lexicon(root, "is", [[ip3, ""]]).
lexicon(root, "kill", [[v, ""]]).
lexicon(root, "kiss", [[n, ""], [v, ""]]).
lexicon(root, "mice", [[in, "N PL"]]).
lexicon(root, "milk", [[mn, ""]]).
lexicon(root, "mouse", [[in, "N SG"]]).
lexicon(root, "move", [[v, ""]]).
lexicon(root, "oc`cur", [[v, ""]]).
lexicon(root, "race", [[v, ""], [n, ""]]).
lexicon(root, "rally", [[n, ""]]).
lexicon(root, "referee", [[v, ""], [n, ""]]).
lexicon(root, "re`fer", [[v, ""]]).
lexicon(root, "ski", [[n, ""]]).
lexicon(root, "sleep", [[iv2, ""]]).
lexicon(root, "slept", [[ipp, ""], [ips, ""]]).
lexicon(root, "spy", [[n, ""], [v, ""]]).
lexicon(root, "tie", [[v, ""]]).
lexicon(root, "tiptoe", [[v, ""]]).
lexicon(root, "toe", [[n, ""]]).
lexicon(root, "travel", [[n, ""], [v, ""]]).
lexicon(root, "try", [[v, ""]]).
lexicon(root, "un", [[root, "NEG"]]).
lexicon(root, "understand", [[iv2, ""]]).
lexicon(root, "understood", [[ipp, ""], [ips, ""]]).
lexicon(root, "undid", [[ips, ""]]).
lexicon(root, "undo", [[iv1, ""]]).
lexicon(root, "undoes", [[ip3, ""]]).
lexicon(root, "undone", [[ipp, ""]]).
lexicon(root, "untie", [[v, ""]]).
lexicon(root, "went", [[ipp, ""]]).
/**********************************************************************
**
**     The English Automata
**
**********************************************************************/

%%% This file contains the finite state automata which encode six
%%% morphophonological rules in English, as described in Karttunen and
%%% Wittenburg (1983).



alphabet(["a","b","c","d","e","f","g","h","i","j","k","l","m","n","o",
          "p","q","r","s","t","u","v","w","x","y","z", ... ]).
          % the non-alphabetic symbols that close this list are
          % illegible in this copy
alphabet(any, "=").
abbrev("V", ["a","e","i","o","u"]).
abbrev("C", ["b", ... ]).       % rest of the list illegible in this copy
abbrev("S", [ ... ]).           % list illegible in this copy

automata([[(surface, [true]),
           ([0,0], [1]),
           ("aa", [1]),
           ("bb", [1]),
           ("cc", [1]),
           ("dd", [1]),
           ("ee", [1]),
           ("ff", [1]),
           ("gg", [1]),
           ("hh", [1]),
           ("ii", [1]),
           ("jj", [1]),
           ("kk", [1]),
           ("ll", [1]),
           ("mm", [1]),
           ("nn", [1]),
           ("oo", [1]),
           ("pp", [1]),
           ("qq", [1]),
           ("rr", [1]),
           ("ss", [1]),
           ("tt", [1]),
           ("uu", [1]),
           ("vv", [1]),
           ("ww", [1]),
           ("xx", [1]),
           ("yy", [1]),
           ("zz", [1]),
           ("''", [1]),
           ("--", [1])],

          [(i_spelling, [true, false, false, false, true, true, false]),
           ("iy", [2,0,0,0,1,0,0]),
           ([101,0], [1,3,0,0,1,1,0]),
           ([43,0], [1,0,4,0,1,7,0]),
           ("ii", [5,0,0,1,1,0,0]),
           ("ee", [1,0,0,0,6,0,0]),
           ("==", [1,0,0,0,1,1,1])],
          [(elision, [true, true, true, false, true, false, false, false, false,
                      false, true, false, false, true, true]),
           ("VV", [2,1,1,0,1,0,1,1,0,0,1,0,0,1,1]),
           ("ii", [2,1,1,0,1,0,1,1,0,0,1,0,1,1,0]),
           ("ee", [3,5,5,0,1,0,0,1,0,1,14,0,1,1,0]),
           ([101,0], [4,6,6,0,4,0,0,0,0,0,12,0,0,0,0]),
           ([43,0], [1,1,9,8,7,10,0,0,0,0,1,13,0,15,1]),
           ("gg", [11,11,11,0,11,0,1,0,1,0,1,0,0,11,11]),
           ("cc", [11,11,11,0,11,0,1,0,1,0,1,0,0,11,11]),
           ("bb", [5,1,5,0,1,0,1,0,1,0,1,0,0,1,1]),
           ("==", [1,1,1,0,1,0,1,0,1,0,1,0,0,1,1])],

          [(epenthesis, [true, true, true, true, false, true]),
           ("cc", [2,2,2,2,0,1]),
           ("hh", [1,3,1,3,0,1]),
           ("ss", [4,3,3,3,1,0]),
           ("SS", [3,3,3,3,0,1]),
           ("yi", [3,3,3,3,0,1]),
           ("+e", [0,0,5,5,0,1]),
           ([43,0], [1,1,6,6,0,1]),
           ("==", [1,1,1,1,0,1])],

          [(gemination, [true, false, true, true, true, true, true, true, true,
                         true, true, true, true, true, true, true]),
           ("VV", [4,1,0,16,16,16,16,16,16,16,16,16,16,16,16,16]),
           ("bb", [1,0,0,5,16,16,16,16,16,16,16,16,16,16,16,16]),
           ("dd", [1,0,0,6,16,16,16,16,16,16,16,16,16,16,16,16]),
           ("ff", [1,0,0,7,16,16,16,16,16,16,16,16,16,16,16,16]),
           ("gg", [1,0,0,8,16,16,16,16,16,16,16,16,16,16,16,16]),
           ("ll", [1,0,0,9,16,16,16,16,16,16,16,16,16,16,16,16]),
           ("mm", [1,0,0,10,16,16,16,16,16,16,16,16,16,16,16,16]),
           ("nn", [1,0,0,11,16,16,16,16,16,16,16,16,16,16,16,16]),
           ("pp", [1,0,0,12,16,16,16,16,16,16,16,16,16,16,16,16]),
           ("rr", [1,0,0,13,16,16,16,16,16,16,16,16,16,16,16,16]),
           ("ss", [1,0,1,14,16,16,16,16,16,16,16,16,16,16,16,16]),
           ("tt", [1,0,0,15,16,16,16,16,16,16,16,16,16,16,16,16]),
           ("+b", [0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0]),
           ("+d", [0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0]),
           ("+f", [0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0]),
           ("+g", [0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0]),
           ("+l", [0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0]),
           ("+m", [0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0]),
           ("+n", [0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0]),
           ("+p", [0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0]),
           ("+r", [0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0]),
           ("+s", [0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0]),
           ("+t", [0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0]),
           ([43,0], [1,0,0,1,3,3,3,3,3,3,3,3,3,3,3,16]),
           ([96,0], [1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1]),
           ("==", [1,0,0,16,1,1,1,1,1,1,1,1,1,1,1,16])],
          [(y_spelling, [true, true, false, false, true, false]),
           ("CC", [2,2,0,1,1,0]),
           ("yy", [1,5,0,1,1,0]),
           ("yi", [0,3,0,0,0,0]),
           ("+=", [1,1,4,1,6,0]),
           ("ii", [1,1,0,0,1,1]),
           ("aa", [1,1,0,0,1,1]),
           ("==", [1,1,0,1,1,0])]]).
/**********************************************************************
**
**     Utilities
**
**********************************************************************/

%%% This file contains some utility predicates used by the compiler.

delete_pair([], _, []) :- !.
delete_pair([Kill|Tail], Kill, Rest) :-
    delete_pair(Tail, Kill, Rest).
delete_pair([Head|Tail], Kill, [Head|Rest]) :-
    delete_pair(Tail, Kill, Rest).

%%% This sorting program is basically the same as in Shieber and
%%% Pereira (1986) except that a few minor changes have been added
%%% to the "merge" clauses.
%%%
mergesort([], []).
mergesort([A], [A]).
mergesort([A,B|Rest], Sorted) :-
    split([A,B|Rest], L1, L2),
    mergesort(L1, SortedL1),
    mergesort(L2, SortedL2),
    merge(SortedL1, SortedL2, Sorted).

split([], [], []).
split([A], [A], []).
split([A,B|Rest], [A|RestA], [B|RestB]) :-
    split(Rest, RestA, RestB).

merge(A, [], A).
merge([], B, B).
merge([([Lex1,Surf1],S1)|RestAs], [([Lex2,Surf2],S2)|RestBs],
      [([Lex1,Surf1],S1)|Merged]) :-
    Lex1+Surf1 < Lex2+Surf2,
    merge(RestAs, [([Lex2,Surf2],S2)|RestBs], Merged).
merge([([Lex1,Surf1],S1)|RestAs], [([Lex2,Surf2],S2)|RestBs],
      [([Lex2,Surf2],S2)|Merged]) :-
    Lex2+Surf2 < Lex1+Surf1,
    merge([([Lex1,Surf1],S1)|RestAs], RestBs, Merged).
merge([([Lex1,Surf1],S1)|RestAs], [([Lex2,Surf2],S2)|RestBs],
      [([Lex1,Surf1],S1)|Merged]) :-
    Lex2+Surf2 =:= Lex1+Surf1,
    Lex1 < Lex2,
    merge(RestAs, [([Lex2,Surf2],S2)|RestBs], Merged).
merge([([Lex1,Surf1],S1)|RestAs], [([Lex2,Surf2],S2)|RestBs],
      [([Lex2,Surf2],S2)|Merged]) :-
    Lex2+Surf2 =:= Lex1+Surf1,
    Lex2 < Lex1,
    merge([([Lex1,Surf1],S1)|RestAs], RestBs, Merged).
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